ABSTRACT

Writing (meta-)programs that manipulate other (object-)programs
poses significant technical problems when the object-language
itself has a notion of binders and variable occurrences.
Higher-order abstract syntax is a representation of object
programs that has recently been the focus of several studies.
This paper points out a number of limitations of using
higher-order syntax in a functional context, and argues that
DALI, a language based on a simple and elegant proposal made by
Dale Miller ten years ago, can provide superior support for
manipulating such object-languages. Miller’s original proposal,
however, did not provide any formal treatment. To fill this gap,
we present both a big-step and a reduction semantics for DALI,
and summarize the results of our extensive study of the
semantics, including the rather involved proof of the soundness
of the reduction semantics with respect to the big-step
semantics. Because our formal development is carried out for the
untyped version of the language, we hope it will serve as a
solid basis for investigating type system(s) for DALI.

1.  INTRODUCTION 

Programs  are  data.  Nothing  makes  this  point  stronger  than 
the  ever  increasing  need  for  reliable  programs  with  verified 
properties.  As  software  systems  become  more  complex,  and 
play  increasingly  important  roles  in  critical  systems  there  is 
an  ever  increasing  need  for  optimizing,  analyzing,  verifying 
and  certifying  software. 

Each one of these tasks involves automatic manipulation of
programs, or meta-programming. As with any kind of programming,
effective meta-programming relies heavily on the presence of
appropriate support from the (meta-)programming language. The
goal of this paper is to advocate a novel approach to
representing programs in a manner superior to the main
contenders available today. Our approach gives rise to a simple
equational theory that can be used to reason about the
equivalence of meta-programs.

*The complete technical development appears in a technical
report available online. This paper focuses on describing DALI
from the point of view of language design and programming.

1.1  Meta-Programming  as  Programming 

It is our thesis that traditional programming language
techniques, including those from the operational, categorical,
axiomatic, and denotational traditions, can be applied equally
effectively to meta-programming languages [45]. In many
instances, this means that the technical challenge is
“internalizing” various meta-level operations, such as quotation
[49], evaluation [48; 29; 4; 45; 47], and type analysis [44;
46], into a formal programming language, and subjecting them to
the same high standards developed by the semantics community.
This approach has numerous pragmatic benefits, including:

1.  We succeed in magnifying the subtle features of the
operations under investigation and, often, in addressing them
in a systematic and complete manner. From the software
engineering point of view, this translates into enhanced
safety and reliability.

2.  We succeed in assigning a uniform semantics to these
operations that must otherwise be carried out in an ad hoc
fashion. This can be done to the extent that we can provide
mathematically verified reasoning principles for these
operations in the form of equational theories. From the
software engineering point of view, this translates into
enhanced correctness.

3.  We make these operations available to the programmer in a
uniform way, thus providing him or her with more control over
the behavior of the system. From the software engineering
point of view, this translates into enhanced predictability.

1.2  Synthesis  vs.  Analysis 

There are two different kinds of program manipulation: program
synthesis and program analysis. The combination of the two is
necessary for expressing general program transformations. In
what follows we outline the state of the art in language
support for both synthesis and analysis, and explain how the
present work on DALI fits in the context of analysis.

1.2.1  Synthesis  and  Multi-Stage  Programming 
Many recent studies have concentrated on language-level support
for program synthesis: works on multi-level [12; 11; 30; 28] and
multi-stage [49; 48; 29; 4; 45; 47] programming languages have
investigated basic problems relating to the language support
needed for program synthesis, such as how to build program
fragments, how to combine smaller program fragments into larger
ones, and how to execute such fragments in a user-friendly,
hygienic, and type-safe manner. But while multi-stage
programming constructs provide good support for the construction
and execution of object-code, they provide no support for
analysis. In fact, adding constructs for analyzing code
fragments can severely weaken the notion of observational
equivalence in such languages [47].

1.2.2  Analysis  and  Higher  Order  Syntax 

In contrast, substantially fewer studies have focused on
language-level support for program analysis [21; 39; 13]. With
few exceptions (see for example Bjorner [5]), the most popular
tool for these studies has been higher-order abstract syntax
[38] (HOAS), and most of them have taken place in the context of
logic programming languages [1]. In the remainder of this paper
we shall (without drawing too fine a distinction) refer to all
approaches to syntax that represent object-level binding
constructs by meta-language binding constructs in a uniform way
as higher-order abstract syntax.

A program analysis inspects the structure and environment of an
object-program and computes some value as a result. Results can
be data- or control-flow graphs, or even another object-program
with properties based on the properties of the source
object-program. Examples of these kinds of meta-systems are
program transformers, optimizers, and partial evaluation
systems [22].

Program analyses are particularly difficult to write correctly
if they must manipulate terms that have a notion of statically
scoped variables. The exact representation of the variable is
generally uninteresting, and often requires subtle
administrative changes so that it maintains its original
“meaning”.

The primary examples of such administrative changes are
renaming, when the “direct” representation of variables is
used, and “shift” and “lift” operations, when de Bruijn indices
are used. The first representation typically relies on the use
of state (a “gensym” operation), and the second representation
is generally considered “too human unfriendly”. Because of
this, object-programs represented as first-order algebraic data
structures, which use strings or other atomic values to
represent variables, are notoriously hard to manipulate
correctly.
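As a minimal sketch of the administrative burden described above, consider capture-avoiding substitution over a string-based first-order representation, written here in plain Haskell. The Term type mirrors the first-order datatype of Section 2; the helper names freeVars, fresh, and subst are illustrative assumptions and are not part of DALI.

```haskell
-- First-order terms with string-named variables.
data Term
  = App Term Term
  | Abs String Term
  | Const Int
  | Var String
  deriving (Eq, Show)

-- Free variables of a term.
freeVars :: Term -> [String]
freeVars (Var x)   = [x]
freeVars (Const _) = []
freeVars (App f a) = freeVars f ++ freeVars a
freeVars (Abs x b) = filter (/= x) (freeVars b)

-- Hypothetical helper: the first name of the form x0, x1, ...
-- that does not occur in the list of names to avoid.
fresh :: String -> [String] -> String
fresh x avoid = go (0 :: Int)
  where
    go i
      | candidate `notElem` avoid = candidate
      | otherwise                 = go (i + 1)
      where candidate = x ++ show i

-- subst x s t replaces the free occurrences of x in t by s,
-- renaming a binder whenever it would capture a free variable of s.
subst :: String -> Term -> Term -> Term
subst _ _ (Const n) = Const n
subst x s (Var y)
  | x == y    = s
  | otherwise = Var y
subst x s (App f a) = App (subst x s f) (subst x s a)
subst x s (Abs y b)
  | y == x              = Abs y b
  | y `elem` freeVars s = Abs y' (subst x s (subst y (Var y') b))
  | otherwise           = Abs y (subst x s b)
  where y' = fresh y (x : freeVars s ++ freeVars b)
```

Even in this small sketch, most of the code is bookkeeping about names rather than about the transformation itself, and it would have to be repeated for every object-language with binders.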

A more pressing concern is that implementing such operations
once is not enough: they need to be implemented for each
object-language that has binding constructs. The basic problem
is therefore pervasive; it appears in almost every interesting
language.

The basic idea that we advocate is to (uniformly) exploit the
binding mechanism of the meta-language to implement the binding
mechanism(s) of the object-language, i.e. use functions in the
meta-language to implement binding in the object-language. At
first glance, this looks like a promising idea, but a number of
subtle problems arise. We explicate these problems carefully in
Section 3. The problems arise because the functions of the
meta-language have two properties which, while necessary for
their use as functions, get in the way of their use as binding
mechanisms. These properties are extensionality and delayed
computation. Extensionality means that one cannot observe the
structure of a function, other than by applying it to get a
result. Delayed computation means that computations embodied in
a function do not occur until the function is applied. What we
need is a new kind of binding, without these properties.

In this paper, we develop such a binding mechanism by refining
some ideas of Dale Miller’s [23]. This new binding mechanism can
be incorporated into a functional language with first-order
datatypes, and together they can be used to represent variable
binding in object-languages. This mechanism can be
systematically reused. In addition, we develop a sound syntactic
system for reasoning about the equivalence of functional
programs that use this new binding mechanism.


1.3  Contribution 

The  contribution  of  this  paper  is  simple  and  focused:  a  call- 
by-value  operational  semantics  for  an  untyped  functional 
programming  language  with  an  extension  that  supports  first- 
order  datatypes  (FOD)  with  binders. 

We have applied the rigorous standards of language design and
semantic analysis to both the host language (the lambda
calculus) and the extension, and discovered that the two are
mutually compatible. The combined language enjoys a non-trivial
equational theory where beta convertibility is a congruence,
and is therefore unlikely to invalidate known optimizations for
a call-by-value functional language.

We believe that our present operationally-based study
complements the recent model-theoretic approach of Gabbay and
Pitts [17], Hofmann [20], and Fiore, Plotkin, and Turi [16].
For example, whereas Pitts and Gabbay’s recent work emphasizes
that a type system is required for their language to ensure
that “namefulness” doesn’t spread everywhere, our language is
untyped, and does not appear to give rise to any non-standard
“namefulness” problems.


2.  HOAS VS. FIRST ORDER DATATYPES

The  precise  semantics  of  (meta-) programs  depends  crucially 
on  the  basic  properties  of  the  representation  of  object-programs. 
This  question  of  representation  is  the  focus  of  the  present 
study. 

The essence of the representation we propose goes back at least
to Church [8]. The idea is to exploit the binding mechanism of
the meta-language to implement the binding mechanism(s) of the
object-language. This is also the essence of Pfenning and
Elliott’s [38] and Miller’s [23; 25; 26] higher-order abstract
syntax (HOAS) representation. To illustrate the basic idea of
higher-order syntax, consider the definitions of Term and Term’
below.


data Term
  = App Term Term
  | Abs String Term
  | Const Int
  | Var String


data Term'
  = App' Term' Term'
  | Abs' (Term' -> Term')
  | Const' Int

In Term' we represent the object-language lambda abstraction
(Abs') using the meta-language function abstraction. This way,
functions such as id and app are represented by applying the
Abs' constructor to a meta-language function:

-- \x -> x
id = Abs "x" (Var "x")


-- \x -> x
id' = Abs' (\x -> x)


-- \f -> \x -> f x
app = Abs "f"
        (Abs "x"
          (App (Var "f")
               (Var "x")))


-- \f -> \x -> f x
app' = Abs' (\f ->
         Abs' (\x ->
           (App' f x)))


The HOAS representation (Term') is elegant in that a concrete
representation for variables is not needed, and in that it is
not necessary to invent unique, new names when constructing
lambda-expressions, names which one can only “hope” don’t clash
with other names.
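The HOAS representation can be sketched directly in plain Haskell. In the sketch below, the datatype and the terms id' and app' follow the definitions above, while the evaluator eval' and the observer asConst are illustrative additions (assumptions, not definitions from this paper) showing that object-level substitution is obtained for free from meta-level application.

```haskell
-- Higher-order abstract syntax: binders are meta-level functions.
data Term'
  = App' Term' Term'
  | Abs' (Term' -> Term')
  | Const' Int

id' :: Term'
id' = Abs' (\x -> x)

app' :: Term'
app' = Abs' (\f -> Abs' (\x -> App' f x))

-- Call-by-name evaluation to weak head normal form.
-- Beta-reduction is just application of the meta-level function.
eval' :: Term' -> Term'
eval' (App' f a) =
  case eval' f of
    Abs' body -> eval' (body a)
    f'        -> App' f' a
eval' t = t

-- Results can only be observed when they are first-order data.
asConst :: Term' -> Maybe Int
asConst (Const' n) = Just n
asConst _          = Nothing
```

For instance, asConst (eval' (App' (App' app' id') (Const' 7))) evaluates to Just 7 without any explicit renaming or substitution code.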


3.  CRITIQUE  OF  HOAS 

This flavor of HOAS seems like a great idea at first, but
careful inspection reveals a few anomalies. It works fine for
constructing statically known representations, but quickly
breaks down when trying to construct or observe a representation
in an algorithmic way. We provide a few small examples that
illustrate the problems we have encountered.

P1 Opaqueness: We cannot pattern match on or observe the
structure of the body of an Abs', or of any object-level
binding, because such bodies are represented as functions in
the meta-language, and meta-level functions are extensional.

We can observe this by casting our Term' example above into a
real program formulated in ML, and noticing that id prints as
Abs' fn.

(* Actual ML Program Execution *)

- datatype Term'
    = App' of (Term' * Term')
    | Abs' of (Term' -> Term')
    | Const' of int;

- val id = Abs'(fn x => x);
val id = Abs' fn : Term'

P2 Junk [7; 6]: There are terms in the meta-language with type
Term' that do not represent any legal object-program. Consider:

junk = Abs' (\ x -> case x of App' f y -> y
                            ; Const' n -> x
                            ; Abs' _   -> x)

No legal object-program behaves in this way.

P3 Latent Divergence: Because functions delay computation, a
non-terminating computation producing a Term' may delay
non-termination until the Term' object is observed. This may be
arbitrarily far from its construction, and can make things very
hard to debug. Consider the function bad below:


bad (Const' n) = Const' (n+1)
bad (App' x y) = App' (bad x) (bad y)
bad (Abs' f)   = Abs' (\ x -> diverge (bad (f x)))

bad walks over a Term', increasing every explicit constant by
one. Suppose the programmer made a mistake and placed an
erroneous divergent computation in the Abs' clause. Note that
bad does not immediately diverge.

P4 Expressivity: Using HOAS, there exist (too many)
meta-functions over object-terms that cannot be expressed.
Consider writing a show function for Term' that turns a Term'
into a string suitable for printing.

show (App' f x) = (show f) ++ " " ++ (show x)
show (Const' n) = toString n
show (Abs' f)   = "\\ " ++ ?v ++ " -> "
                  ++ (show (f ?v))

What legal meta-program value do we use for ?v? We need some
sort of “variable” with type Term', but no such thing can be
created. There are “tricks” for solving this problem [14], but
in the end, they only make matters worse.
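To make the difficulty with show concrete, the following plain-Haskell sketch uses one commonly seen workaround: extending the datatype with an extra placeholder constructor. This is an illustrative assumption and not necessarily the technique of [14]; the constructor Var' and the function showT are hypothetical names.

```haskell
data Term'
  = App' Term' Term'
  | Abs' (Term' -> Term')
  | Const' Int
  | Var' String   -- hypothetical placeholder, needed only for printing

-- showT k t prints t, using k to invent a name for each binder.
showT :: Int -> Term' -> String
showT _ (Var' v)   = v
showT _ (Const' n) = show n
showT k (App' f x) = "(" ++ showT k f ++ " " ++ showT k x ++ ")"
showT k (Abs' f)   = "\\" ++ v ++ " -> " ++ showT (k + 1) (f (Var' v))
  where v = "x" ++ show k
```

The printer now works, but only because Term' has been enlarged: every other function over Term' must handle Var', and a body such as the junk example of P2 can inspect the placeholder, so the workaround adds junk rather than removing it.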

Our approach to these problems is to cast our search for
solutions as an exercise in programming language design. The
following subsections offer an informal discussion of each
problem and a potential solution by the introduction of
additional language features, and provide examples of how these
language features might be used. Our biggest challenge is to
discover features that interact well, both with each other and
with the existing features of the language we wish to add them
to.

3.1  Opaqueness 

To  solve  the  opaqueness  problem  a  number  of  researchers 
have  investigated  the  use  of  higher-order  pattern  matching 
[32].  The  basic  idea  is  that  programmers  use  a  higher-order 
interface  to  the  object-language  because  it  is  expressive  and 
easy  to  use,  but  the  actual  underlying  implementation  is 
first  order. 

One tries to supply an enriched interface that gives
programmers access to this first-order implementation in a safe
manner that still supports all the benefits of a higher-order
implementation. To illustrate this, consider the (not
necessarily semantics-preserving) rewrite rule f for
object-terms Term', which might be expressed as:

f : (λx. (e' 0)) → (e'[0/x]).

Here we use the convention that a primed variable is a
meta-variable. Thus e' is a meta-variable of the rule, and
e'[0/x] indicates the capture-free substitution of 0 for x in
e'.

Higher-order pattern matching is a programming language
mechanism that allows us to express that we wish to observe the
inner structure of a meta-language abstraction, and that parts
of the body of this abstraction (i.e. e') may have free
occurrences of x inside.

We use a higher-order pattern when we wish to analyze the
structure of a constructor like Abs' which takes a
meta-function as an argument. Like all patterns, a higher-order
pattern “binds” a meta-variable. The meta-variable bound by a
higher-order pattern does not bind to an object-term, but
instead binds to a function. This function captures the
subtlety that e' might have free occurrences of x. Given the
bound variable as input, it reconstructs the body of the
abstraction. Given a term as input, it substitutes the term for
each free occurrence of the bound variable in the body.

The bound meta-variable is a function of type Term' -> Term'.
We make this language mechanism concrete by extending the
notion of pattern in our meta-language. Patterns can now have
explicit lambda abstractions, but any pattern-variables inside
the body of the lambda abstraction are higher-order
pattern-variables, i.e. they will bind to functions. Consider
below an example implementing the rewrite rule above.

f (Abs' (\ x -> App' (e' x) (Const' 0))) = e' (Const' 0)
f x = x

In this example the meta-function f matches its argument
against an object-level abstraction (Abs' ...) using an
object-level pattern. The pattern specifies that the body of
the matched abstraction must be an application (App') of a
function term (e' x) to a constant (Const' 0). The function
part of this object-application can be any term. This term may
have free occurrences of the object-bound variable (which we
write as x in the pattern, but which can have any name in the
object-term it matches against). Because of this we use a
higher-order pattern (e' x), which applies e' to x, to indicate
that e' is a function whose argument is the object-bound
variable x.

This extension differs from normal pattern matching in that
neither meta-level abstractions (\ x -> ...) nor applications
of meta-variables (e' x) normally appear in regular patterns.

If the underlying implementation is first order (like Term),
patterns of this form have an efficient and decidable
implementation. The clause

f (Abs' (\ x -> App' (e' x) (Const' 0))) = e' (Const' 0)

would translate into an implementation using Term as follows:

f (Abs x (App e (Const 0))) =
    let e' y = subst [(x,y)] e
    in e' (Const 0)

The  key  advantage  of  this  approach  is  that  users  get  to  use 
the  expressive  and  safe  HOAS  interface,  and  the  substitution 
function  need  not  be  written  by  the  programmer  but  can  be 
supplied  by  the  underlying  implementation. 
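The first-order translation above can be made runnable in plain Haskell as a rough sketch. The substitution function used here, substClosed, is a hypothetical helper that does no renaming; it is only safe when the substituted term is closed, which is an assumption that holds for the constant 0 used by this rule.

```haskell
data Term
  = App Term Term
  | Abs String Term
  | Const Int
  | Var String
  deriving (Eq, Show)

-- Naive substitution of s for the free occurrences of x.
-- No binders are renamed, so s is assumed to be closed.
substClosed :: String -> Term -> Term -> Term
substClosed x s (Var y)
  | x == y    = s
  | otherwise = Var y
substClosed _ _ (Const n) = Const n
substClosed x s (App g a) = App (substClosed x s g) (substClosed x s a)
substClosed x s (Abs y b)
  | x == y    = Abs y b
  | otherwise = Abs y (substClosed x s b)

-- First-order reading of the rule  (\x. e' 0) --> e'[0/x].
-- The higher-order pattern variable e' becomes a local function
-- that substitutes its argument for the bound variable x in e.
f :: Term -> Term
f (Abs x (App e (Const 0))) = e' (Const 0)
  where e' y = substClosed x y e
f t = t
```

For example, f (Abs "x" (App (App (Var "g") (Var "x")) (Const 0))) yields App (Var "g") (Const 0), while any term not matching the pattern is returned unchanged.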

The solution of using a (hidden) underlying first-order
implementation, while supplying a higher-order interface,
extends nicely to term construction as well as term
observation.

A construction like (Abs' f) :: Term' could be translated into
an underlying implementation based on first-order, observable
data structures (i.e. Term) by using a gensym construct to
provide a “fresh” name for the required object-bound variable:

let y = gensym () in Abs y (f (Var y)).


Again, both the gensym and the underlying first-order
implementation are hidden from the user.
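The gensym-based translation of term construction can be sketched in plain Haskell by threading a name supply explicitly instead of using state. The names Supply, Build, gensym, mkAbs, done, and appExample are illustrative assumptions and do not come from the paper.

```haskell
data Term
  = App Term Term
  | Abs String Term
  | Const Int
  | Var String
  deriving (Eq, Show)

-- An explicit name supply stands in for a stateful gensym.
type Supply = Int

-- A term builder consumes a supply and returns the rest of it.
type Build a = Supply -> (a, Supply)

gensym :: Build String
gensym n = ("_v" ++ show n, n + 1)

-- Translation of (Abs' f): pick a fresh name y, build the body
-- from (Var y), and wrap the result in a first-order Abs.
mkAbs :: (Term -> Build Term) -> Build Term
mkAbs f s0 =
  let (y, s1)    = gensym s0
      (body, s2) = f (Var y) s1
  in (Abs y body, s2)

-- A finished term that needs no further fresh names.
done :: Term -> Build Term
done t s = (t, s)

-- \f -> \x -> f x, built through the higher-order interface.
appExample :: Build Term
appExample = mkAbs (\f -> mkAbs (\x -> done (App f x)))
```

Running appExample 0 produces the first-order term Abs "_v0" (Abs "_v1" (App (Var "_v0") (Var "_v1"))) together with the next unused supply value.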

3.2  Junk 

Junk is a serious problem in that it allows meta-programs to
represent non-existent terms in the object-language. Junk
arises because the body of an object-binding is a computation
(i.e., a suspended function), rather than a constant piece of
data. This causes two kinds of problems:

1.  The computation can “observe” the bound variable, and do
ill-advised things like pattern matching. A valid
object-binding only “builds” new structure around the variable.
It does not observe the bound variable.

2.  The computation can introduce effects. In this case the
computational effects of the meta-language, such as
nontermination, are introduced into the purely syntactic
representation of the object language. Even worse, the effects
are only introduced when the object-term is observed. If a term
is observed multiple times, the effects are introduced multiple
times.

3.3  Latent  Divergence 

So we see that junk and latent divergence are really two facets
of the same problem. To fix these problems we need a binding
construct which preserves static scoping (like normal
meta-level functions) but which does not delay computation.
What we need is a binding construct which forces computation
“under the lambda” [45].

Ten years ago, Dale Miller proposed a new meta-level binding
construct for implementing HOAS in ML [23] which did exactly
this. He introduced a new binary type constructor (a => b),
which denotes the type of an object-level binding of a variable
of type a in a term of type b. The new type constructor was
used in place of the function type constructor to denote
object-level abstraction.

We introduce DALI, a language based upon a refinement of
Miller’s idea. We compare it to HOAS, and illustrate its
intended use by a number of examples. In DALI, the
object-binding mechanism is separate from the function
construct of the meta-language. This allows us to restrict the
range of junk, and the introduction of erroneous effects.
Consider our small lambda calculus example once again.

datatype Term
  = App Term Term
  | Abs (Term => Term)
  | Const Int

Terms of type a => b are introduced using the meta-language
construct for object-binding introduction. The expression-level
syntax of the meta-language is analogous to the syntax of the
type constructor for object-level bindings. For example:
(#x => App #x (Const 0)) :: (Term => Term). Here we use the
hash (#x) notation to distinguish object-level variables from
meta-level variables. An important property of
object-variables is that they cannot escape their scope. Like
meta-level function binding, object-binding respects static
scoping. The => introduction construct (#x => e) delimits the
scope of #x to e. The key property of object-binding is that
evaluation proceeds under =>.

Below  are  two  different  examples  of  constructing  an  object- 
language  program.  The  first  using  meta-level  functions  as 
the  binding  mechanism,  and  the  second  using  object-level 
abstraction: 

Abs' (\x -> bottom)          Abs (#x => bottom)

The expression on the left uses a meta-language binding
mechanism (λ-abstraction). It succeeds in constructing a
representation of an object-language program which obviously
has no meaning. The expression on the right, however, does not
represent any object-language program, since the expression
never terminates. Note that the effect on the left has seeped
into the object-language program representation (junk), while
on the right non-termination occurs before the object-language
program is constructed and thus is never present in the
object-language program itself.

A more sophisticated example is the copy function over Term:

copy (App f x)           = App (copy f) (copy x)
copy (Const n)           = Const n
copy (Abs (#x => e' #x)) = Abs (#y => (copy (e' #y)))
copy (x @ #_)            = x

To those familiar with functional programming, the first two
clauses should be clear. The third clause uses the higher-order
pattern matching introduced earlier, only applied here in the
context of the new object-level binding construct. Since
evaluation passes under =>, diverging computations will not be
delayed.

The fourth clause of the copy function is an artifact of the
object-level binding mechanism. Object-variables (#x)
introduced using the object-binding syntax (#x => ...) are a
new type of constant. The actual name of such a constant is not
accessible to the programmer. Two operations are necessary on
object-variables: it should be possible to distinguish them
from other object-terms, and it should be possible to compare
them using equality, in order to tell them apart.

Thus, functions over object-languages must have a clause for
object-bound variables. Object-bound variables are distinct
from all other constructors, and are common to all
object-languages. The pattern #_ matches any object-variable,
but fails to match other constructors. The binding says nothing
about the name of the variable it binds to. The notation
(x @ #_) introduces a meta-level variable x, bound to the
object-level variable matched by the object-pattern #_.

3.4  Expressivity 

It is sometimes necessary to eliminate object-bound variables.
This is done in one of two ways. The first is by applying a
higher-order pattern variable to some value x :: Term; the
occurrences of the bound variable will then be replaced with x.

This is not always sufficient, since it does not provide any
way of transforming an object-binding into anything other than
another object-binding. This was the problem with the show
function (Section 3). This is why HOAS using meta-level
function binding cannot express some functions. Object-level
binding allows us to solve this problem.

The solution is a new language construct, discharge. The
construct (discharge #x => e1) introduces a new object-level
variable (#x), whose scope is the body e1. The value of the
discharge construct is the value of its body e1. The body e1
can have any ground type, unlike an object-level binding
(#x => e2), where e2 must be an object term.

In addition, discharge incurs an obligation that the object
variable (#x) does not appear in the value of the body (e1).
An implementation must raise an error if this occurs.

For example, consider a function which counts the number of
Const subterms in a Term.

count :: Term -> Int
count (Const _) = 1
count (App f x) = (count f) + (count x)
count (Abs (#x => e' #x)) =
    discharge #y => count (e' #y)
count #_ = 0

Note that the fourth clause conveniently replaces all
introduced object-bound variables with 0, thus guaranteeing
that no object-variable appears in the result. The obligation
that the variable does not escape the body of the discharge
construct may require a run-time check (though in this example,
since the result has type Int, no such occurrence can happen).
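The run-time obligation of discharge can be approximated over a first-order representation in plain Haskell. This is only a sketch under stated assumptions: the caller supplies a name v assumed to be fresh, the body is restricted to Term-valued results, and the error is modelled with Either rather than raised. The names occurs and dischargeTerm are hypothetical and are not DALI primitives.

```haskell
data Term
  = App Term Term
  | Abs String Term
  | Const Int
  | Var String
  deriving (Eq, Show)

-- Does the variable v occur free in the term?
occurs :: String -> Term -> Bool
occurs v (Var y)   = v == y
occurs _ (Const _) = False
occurs v (App f a) = occurs v f || occurs v a
occurs v (Abs y b) = y /= v && occurs v b

-- Run the body with a new variable v (assumed fresh) and check
-- that v does not appear in the value the body produces.
dischargeTerm :: String -> (Term -> Term) -> Either String Term
dischargeTerm v body
  | occurs v result = Left ("object variable " ++ v ++ " escaped its scope")
  | otherwise       = Right result
  where result = body (Var v)
```

A body that ignores or eliminates the variable, such as \_ -> Const 1, is accepted, while a body such as \x -> App x (Const 1) is rejected because the introduced variable would appear in the result.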

If a programmer needs to treat individual object-bound
variables in different ways, he or she can use an environment
parameter. Consider the program below, which is the correct
implementation of the function show.

show x = sh [] x
  where
    sh n (App f x) = (sh n f) ++ " " ++ (sh n x)
    sh n (Const c) = toString c
    sh n (x @ #_)  = lookup x n
    sh n (Abs (#y => f #y)) =
      let k = len n
          v = "x" ++ (toString k)
      in discharge #x =>
           "\\ " ++ v ++ " -> "
                 ++ (sh ((#x, v) : n) (f #x))

Here the environment n is a list of pairs mapping
object-variables to strings. If the sh function is applied to
an object-variable, it looks up its name in the environment.
For an object-abstraction, (Abs (#y => f #y)), discharge
introduces a new object-variable, adds it to the environment,
and then applies the higher-order pattern variable f to the
introduced variable, and recursively produces a string as the
representation of the abstraction’s body.

Another  example  transforms  a  Term  into  its  de  Bruijn 
equivalent  form. 

data DB
  = DApp DB DB
  | DAbs DB
  | DVar Int
  | DConst Int

DeBruijn env (App f x) =
    DApp (DeBruijn env f) (DeBruijn env x)
DeBruijn env (Abs (#x => e' #x)) =
    discharge #y => DAbs (DeBruijn (ext env #y) (e' #y))
  where ext env v u =
          if v=u then 0 else 1 + (env u)
DeBruijn env (Const n) = DConst n
DeBruijn env (z @ #_)  = env z
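For comparison, the same environment-passing technique can be sketched in plain Haskell over a string-based first-order representation, where no discharge is needed because variable names are ordinary strings. The names Env, toDB, and emptyEnv are illustrative assumptions; ext follows the definition given above.

```haskell
data Term
  = App Term Term
  | Abs String Term
  | Const Int
  | Var String

data DB
  = DApp DB DB
  | DAbs DB
  | DVar Int
  | DConst Int
  deriving (Eq, Show)

-- The environment maps each variable in scope to its index.
type Env = String -> Int

-- Entering a binder: the new variable gets index 0 and every
-- other variable in scope is shifted by one.
ext :: Env -> String -> Env
ext env v u = if v == u then 0 else 1 + env u

toDB :: Env -> Term -> DB
toDB env (App f x) = DApp (toDB env f) (toDB env x)
toDB env (Abs x b) = DAbs (toDB (ext env x) b)
toDB _   (Const n) = DConst n
toDB env (Var z)   = DVar (env z)

-- Environment for closed terms; free variables are an error.
emptyEnv :: Env
emptyEnv v = error ("unbound variable: " ++ v)
```

Applied to the representation of \f -> \x -> f x, toDB emptyEnv returns DAbs (DAbs (DApp (DVar 1) (DVar 0))).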

4.  EXAMPLES 

In  this  section  we  use  our  language  to  express  some  classic 
manipulations  on  object-languages. 

•  Lambda  calculus  syntax 

datatype Lterm = App Lterm Lterm
               | Abs (Lterm => Lterm)
               | Const Int
               | Prod Lterm Lterm

•  Call-by-name big-step evaluator for the untyped lambda
calculus:

eval : Lterm -> Lterm
eval (Abs body)  = Abs body
eval (App t1 t2) =
  case eval t1 of
    (Abs (#x => body #x)) -> eval (body t2)
eval (Const n)   = Const n
eval (Prod x y)  = Prod (eval x) (eval y)
eval (x @ #_)    = x

•  CBN  lambda  calculus  (single  step)  reduction: 

beta  :  Lterm  ->  Lterm  ->  Lterm 
beta  (Abs  (#z  =>  body  #z))  t2  =  body  t2 

•  Complete development:

compdev : Lterm -> Lterm
compdev (Abs (#x => body #x))
  = Abs (#w => compdev (body #w))
compdev (App (Abs (#x => body #x)) y)
  = sub (Abs (#w => compdev (body #w))) (compdev y)
  where sub (Abs (#z => e #z)) x = e x
compdev (App f x)  = App (compdev f) (compdev x)
compdev (Prod x y) = Prod (compdev x) (compdev y)
compdev (Const n)  = Const n
compdev (x @ #_)   = x

•  Substitution on Lterms.

find x []         = Nothing
find x ((y,v):ys) = if x==y
                      then Just v
                      else find x ys

subst : [(Lterm, Lterm)] -> Lterm -> Lterm
subst env x =
  case find x env of
    Just t  -> t
    Nothing ->
      case x of
        v @ #_ -> v
        Abs (#x => e #x) ->
          Abs (#w => subst env (e #w))
        App x y  -> App (subst env x) (subst env y)
        Prod x y -> Prod (subst env x) (subst env y)
        Const n  -> Const n


5.  NEW FEATURES OF DALI

The  language  DALI  contains  some  features  that  behave  in 
untraditional  ways.  It  is  useful  to  call  attention  to  these 
features. 

•  Object variable bindings: Unlike the meta-language
binding construct, the evaluation of an object-level
abstraction (#x => e) proceeds “under” the =>.

•  Ground (or equality) values: such values can be compared
for simple structural equality. The important property of
ground values is that they do not contain functions. Only
ground values are used to represent valid object-language
terms.

In order to compare object-language terms for equality
it is necessary to compare object-variables for equality.
This must be a primitive in the language. Equality on
object-language types is important for two reasons. First,
it facilitates an important programming technique,
illustrated in our de Bruijn notation example above.
Second, it makes possible higher-order pattern matching
(see below).

•  Object-variable matching: Comparing object-language
terms for equality is not enough for the meta-programs
in our examples. We must be able to distinguish
object-level variables from other object-level terms.
This is the purpose of the (#_) pattern.

•  Higher-order pattern matching: A higher-order pattern
variable (i.e., x in (\(#z => x #z) -> e)) is bound to a
(meta-level) function that returns the body of the
object-level abstraction after replacing the object-variable
with its argument. This in effect internalizes substitution
for bound object-level variables in object programs. In
order to implement such a scheme, it is important that the
object-abstraction body be an equality type. That is, we
must somehow disallow types of the form (a => (b -> c)).
If we do not do this then interesting anomalies may occur.
For example, consider:

f  (#z  =>  x  #z)  =  x  (Const  0) 
w  =  (#x  =>  (\  y  ->  Prod  #x  (Const  5))) 

If we apply f to w, we must build a meta-level function
x which replaces all occurrences of #x in
(\ y -> Prod #x (Const 5)) with (Const 0). It is unlikely
we can do this if functions are only extensional, since an
extensional function can be applied but its body cannot be
inspected.
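
The reading of a higher-order pattern variable as a substitution
function, and the reason a function in the binder body defeats it, can
be sketched in Python. This is an informal illustration and not DALI:
it assumes a first-order named encoding of ground terms, it assumes the
replacement term is closed (so variable capture is not handled), and
the names subst_var and open_binder are hypothetical.

```python
# Assumed first-order encoding of ground object terms (not DALI syntax):
#   ("Bind", name, body) | ("Var", name) | ("Const", n) | ("Prod", a, b)

def subst_var(term, name, replacement):
    """Replace free occurrences of object variable `name` in a ground term.

    Assumes `replacement` is closed, so no capture-avoiding renaming is done.
    """
    if callable(term):
        # A meta-level function is opaque: its body cannot be traversed.
        raise TypeError("cannot substitute inside a meta-level function")
    tag = term[0]
    if tag == "Var":
        return replacement if term[1] == name else term
    if tag == "Const":
        return term
    if tag == "Prod":
        return ("Prod", subst_var(term[1], name, replacement),
                        subst_var(term[2], name, replacement))
    if tag == "Bind":
        if term[1] == name:              # an inner binder shadows `name`
            return term
        return ("Bind", term[1], subst_var(term[2], name, replacement))
    raise ValueError(f"unknown term: {term!r}")

def open_binder(binding):
    """Model of matching (#z => x #z): return the function bound to x."""
    _, name, body = binding
    return lambda arg: subst_var(body, name, arg)

def f(binding):
    x = open_binder(binding)
    return x(("Const", 0))

# Ground body: substitution is a plain traversal.
ground = ("Bind", "x", ("Prod", ("Var", "x"), ("Const", 5)))
result = f(ground)

# Body containing a meta-level function, like w in the text.
w = ("Bind", "x", lambda y: ("Prod", ("Var", "x"), ("Const", 5)))
try:
    f(w)
    anomaly = None
except TypeError as err:
    anomaly = str(err)
```

On the ground binding the traversal succeeds and result is
Prod (Const 0) (Const 5); on w the sketch has no way to reach the
occurrence of the variable hidden inside the Python function.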

6.  A  NOTE  ABOUT  “DISCHARGE” 

In defining meta-programs, the use of discharge is often
crucial, since it allows for eliminating an object-level binding
and performing computations only on its body. However,
the binding’s body can be safely extracted only if there is
a guarantee that a heretofore bound object-level variable
cannot become free as a result of computation over its body.
There are two ways of adding discharge to DALI: first, as
a new language construct with appropriate reduction rules;
and second, as a function defined by the user on a
per-datatype basis.


In the present paper, we opt for the second design decision
in order to keep the core calculus of DALI and its technical
development as small as possible. We present an example
of a user-defined discharge function for the lambda-term
datatype (Lterm):

discharge (#w => t #w) =
  case (#a => find #a (t #a)) of
    (#z => False) -> t ()
    (#z => True)  -> diverge

find var (App t1 t2) =
  (find var t1) or (find var t2)
find var (Abs (#w => b #w)) =
  case (#z => find var (b #z)) of
    (#z => True)  -> True
    (#z => False) -> False
find var (Prod t1 t2) =
  (find var t1) or (find var t2)
find var (Const n) = False
find var (x @ #_) =
  if x = var then True
  else False


The function discharge simply searches the body of an
object-level abstraction for the abstracted object-level
variable. If the variable is not found in the body, the program
simply returns the body itself. Otherwise, the computation
diverges.
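
The check performed by discharge can be sketched outside DALI as
follows. The Python fragment is an illustration under stated
assumptions: terms are first-order tagged tuples with named variables,
the names occurs, discharge and EscapingVariable are hypothetical, and
an exception stands in for divergence so that the failure case can be
observed.

```python
# Assumed first-order encoding (not DALI syntax):
#   ("App", a, b) | ("Abs", name, body) | ("Prod", a, b)
#   | ("Const", n) | ("Var", name)

class EscapingVariable(Exception):
    """Raised where the DALI version would diverge."""

def occurs(name, term):
    """True if object variable `name` occurs free in `term`."""
    tag = term[0]
    if tag in ("App", "Prod"):
        return occurs(name, term[1]) or occurs(name, term[2])
    if tag == "Abs":
        # An inner binder of the same name shadows the variable.
        return term[1] != name and occurs(name, term[2])
    if tag == "Const":
        return False
    if tag == "Var":
        return term[1] == name
    raise ValueError(f"unknown term: {term!r}")

def discharge(binding):
    """Return the body of a binding whose variable is unused in the body."""
    _, name, body = binding
    if occurs(name, body):
        raise EscapingVariable(name)
    return body

body = discharge(("Abs", "w", ("Prod", ("Const", 1), ("Const", 2))))
```

Only a ground term can be traversed this way, which matches the
restriction on discharge discussed in this section.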

It is important to note that in DALI, discharge, whether
added as a language construct or defined as a function, can
be used only on ground values, i.e., values that do not
contain suspended computation. Extending discharge to apply
to values that contain abstraction causes confluence and
soundness problems similar to those described in section
8.1.

However, there appear to be situations where a more
general version of discharge is desirable. The example below
implements a kind of evaluation for the familiar encoding of
untyped lambda terms, using an environment.

data Value = Vint Int
           | Vprod Value Value
           | Vfun Value -> Value

type Env = [(Lterm * Value)]

eval' : Env -> Lterm -> Value
eval' env (Abs (#x => b #x)) =
  discharge #w =>
    Vfun (\ y -> eval' (extend #w y env) (b #w))
eval' env (App t1 t2) =
  case eval' env t1 of
    Vfun f -> f (eval' env t2)
eval' env (Const n) = Vint n
eval' env (Prod x y)
  = Vprod (eval' env x) (eval' env y)
eval' env (x @ #_) = env x

In the present version of the language it is impossible to
define the discharge function needed for the Abs clause
of eval', since it would involve detection of free
object-bound variables in a term that contains an (extensional)
meta-level function. On the other hand, the example is
intuitively correct, and one can convincingly argue from the
definition of the function eval' that the discharged
object-level variables indeed never appear in the values of eval'.
Whether an appropriate mechanism can be introduced to
extend discharge to such cases remains a question yet to be
fully addressed for DALI.

7.  FORMAL  SEMANTICS  OF  CORE  DALI 

7.1  Syntax 

Figure 1 defines the various syntactic categories used in
specifying Core DALI, including expressions E, ground values B,
values V, and contexts C.

Expressions  in  Core  DALI  include  the  lambda  calculus 
with  naturals.  Further,  the  language  incorporates  datatypes 
(not  necessarily  just  first-order),  in  addition  to  the  following 
specialized  mechanisms: 

•  Object-level variables and binders (#z => e).

•  Pattern matching over object-bindings λ(#z => x).e.

•  Equality for object-bound variables #z =# #z'.

•  Test of whether an expression evaluates to an
object-level variable (isOVar e).

Values, ground values, and contexts are used in defining the
reduction semantics.

7.2  Core  DALI  vs.  Example  Language 

Core DALI has two forms of pattern matching, both more
primitive than those of the language used in the examples:
one for tagged values, one for object-level bindings. A third
form of pattern matching (for object-level variables) can be
easily encoded using isOVar.

Nested patterns are not allowed, nor are more complicated
higher-order patterns directly supported: each constructor
has one argument, and each higher-order pattern variable
has exactly one possible free object variable in it. These
simplifications make the formal development of Core DALI
more manageable, without losing generality: programs in the
more familiar language of our examples can be translated
into equivalent, albeit more verbose, Core DALI expressions.

7.3  Big-Step Semantics (λD)

Figure 2 defines the call-by-value (CBV) big-step semantics
for Core DALI. Note that this semantics does not require a
gensym function or any freshness conditions on variables:
all necessary variable renaming is handled by two standard
notions of substitution [2], one for object-level variables
(#z ∈ Z) and one for meta-level variables (x ∈ X).

7.4  Reduction Semantics (λD)

Figure 1 defines the reduction semantics for Core DALI.

8.  SUMMARY OF TECHNICAL DEVELOPMENT

The main technical result of our work to date is establishing
the confluence property for the reduction semantics
described above, and establishing the (rather non-trivial)
connection between the reduction semantics and the big-step
semantics. In doing so, we have closely followed Taha’s
development for the (substantially smaller) language λ-U


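Syntax:

$$
\begin{array}{rcl}
x \in X & & \text{Infinite set of names} \\
z \in Z & & \text{Infinite set of names} \\
f \in \mathbb{F} & & \text{Infinite set of names containing True and False} \\
F & & \text{Finite subsets of } \mathbb{F} \\
e \in E & ::= & () \mid x \mid \lambda x.e \mid e\ e \mid (e,e) \mid \pi_1\, e \mid \pi_2\, e \mid f\ e \mid \lambda_{f \in F}(f\ x_f).e_f \mid {} \\
 & & \#z \mid \#z \Rightarrow e \mid \lambda(\#z \Rightarrow x).e \mid \mathit{isOVar}\ e \mid e \mathrel{=\#} e \\
C \in \mathbb{C} & ::= & [\,] \mid \lambda x.C \mid C\ e \mid e\ C \mid (e,C) \mid (C,e) \mid \pi_1\, C \mid \pi_2\, C \mid {} \\
 & & \lambda_{f \in F - \{f'\}}((f\ x_f).e_f) \mathbin{+\!\!+} (f'\ x).C \mid f\ C \mid (\#z \Rightarrow C) \mid \lambda(\#z \Rightarrow x).C \mid {} \\
 & & \mathit{isOVar}\ C \mid C \mathrel{=\#} e \mid e \mathrel{=\#} C \\
b \in B & ::= & () \mid (b,b) \mid f\ b \mid \#z \mid \#z \Rightarrow b \\
v \in V & ::= & () \mid \lambda x.e \mid (v,v) \mid f\ v \mid \lambda_{f \in F}\, f\ x_f.e_f \mid \#z \mid \#z \Rightarrow v \mid \lambda(\#z \Rightarrow x).e \\
\rho & ::= & \beta_1 \mid \pi_1 \mid \pi_2 \mid \beta_2 \mid \beta_3 \mid \# \mid \mathit{isOVar}
\end{array}
$$

$$
\#z_1 \mathrel{=\#} \#z_2 \qquad \mathit{isOVar}\ \#z \qquad \mathit{isOVar}\ v
$$

Reduction Semantics:

$$
\dfrac{e_1 \to_{\rho} e_2}{C[e_1] \to C[e_2]}
\qquad \rho \in \{\beta_1, \pi_1, \pi_2, \beta_2, \beta_3, \#, \mathit{isOVar}\}
$$

Figure 1: Syntax and Reduction Semantics (λD) of Core DALI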
[45; 47]. Taha’s development is based on Takahashi’s parallel
reduction and complete development methods for proving
confluence [52], and Plotkin’s “standardization” technique
for showing that reductions preserve observational
equivalence.

This section summarizes our technical development,
states our main results, and explains how they were useful
to us in the process of designing the semantics for DALI.
The full details cannot be included in this paper, and are
presented instead in a technical report available on-line [36,
40 pages].

8.1  Confluence 

The  first  result  is  confluence: 

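Theorem 1 (λD Is Confluent).
$$
\forall e_1, e_2, e \in E.\quad
e_1 \;{}^{*}\!\!\leftarrow e \to^{*} e_2
\;\Longrightarrow\;
(\exists e' \in E.\ e_1 \to^{*} e' \;{}^{*}\!\!\leftarrow e_2)
$$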

First, we are not aware of a similar proof for a language with
datatypes. Furthermore, this result establishes the existence
of a confluent calculus for a language with a notion of
object-level binders, and analysis on these terms. In particular,
this result means that DALI also provides a solution to the
problem of introducing intensional analysis to MetaML in a
“coherent” manner [47].

8.1.1  Role in Design of DALI


In addition to its technical role in arriving at our next
result, establishing the confluence property played an
important role in our design process: it drew our attention to the
need for introducing the notion of ground values, thereby
prohibiting any useful mixing of object-binder and function
spaces in datatypes.

In particular, analysis over object-level binders (β3
reduction) without the restriction of the argument to ground
values breaks confluence, as is illustrated in the following
example:


Suppose the notion of reduction →β3 (Figure 1) were
defined as follows (we emphasize the part different from the
standard definition by placing it into a box):

$$
(\lambda(\#z' \Rightarrow x).e)\ (\#z \Rightarrow \boxed{v})
\;\to_{\beta_3}\;
e[x := \lambda y.\, v[\#z := y]]
$$


Now, consider the function f = (λ(#w => x). x (λu.u)).
This function takes an object-level binding as its argument
and returns the body of the binding in which the
object-bound variable has been replaced with the identity function
λu.u. For the application of f to the object binding
(#z => (λy. #z = #z)), there are two possible reduction
sequences:

$$
f\ (\#z \Rightarrow (\lambda y.\ \#z = \#z))
\;\to_{\beta_3}\; \lambda y.\ (\lambda u.u) = (\lambda u.u)
$$

and

$$
f\ (\#z \Rightarrow (\lambda y.\ \#z = \#z))
\;\to_{\#}\; f\ (\#z \Rightarrow (\lambda y.\ \mathit{True}()))
\;\to_{\beta_3}\; \lambda y.\ \mathit{True}()
$$

Clearly, neither λy.True(), nor λy.(λu.u) = (λu.u) can be

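$$
\dfrac{e_1 \hookrightarrow \lambda x.e \quad e_2 \hookrightarrow e_3 \quad e[x := e_3] \hookrightarrow e_4}
      {e_1\ e_2 \hookrightarrow e_4}
\qquad
\dfrac{e_1 \hookrightarrow \lambda_{f \in F \cup \{k\}}(f\ x_f).e_f \quad e_2 \hookrightarrow k\ e_4 \quad e_k[x_k := e_4] \hookrightarrow e_5}
      {e_1\ e_2 \hookrightarrow e_5}
$$

$$
\dfrac{e_1 \hookrightarrow \lambda(\#z \Rightarrow x).e \quad e_2 \hookrightarrow \#z \Rightarrow b_3 \quad e[x := \lambda x'.(b_3[\#z := x'])] \hookrightarrow e_4}
      {e_1\ e_2 \hookrightarrow e_4}
$$

$$
\dfrac{}{() \hookrightarrow ()}
\qquad
\dfrac{}{\lambda x.e \hookrightarrow \lambda x.e}
\qquad
\dfrac{e_1 \hookrightarrow e_3 \quad e_2 \hookrightarrow e_4}{(e_1,e_2) \hookrightarrow (e_3,e_4)}
\qquad
\dfrac{e \hookrightarrow (e_3,e_4)}{\pi_1\, e \hookrightarrow e_3}
\qquad
\dfrac{e \hookrightarrow (e_3,e_4)}{\pi_2\, e \hookrightarrow e_4}
$$

$$
\dfrac{e_1 \hookrightarrow e_2}{f\ e_1 \hookrightarrow f\ e_2}
\qquad
\dfrac{}{\lambda_{f \in F}\, f\ x_f.e_f \hookrightarrow \lambda_{f \in F}\, f\ x_f.e_f}
\qquad
\dfrac{}{\#z \hookrightarrow \#z}
\qquad
\dfrac{e_1 \hookrightarrow e_2}{\#z \Rightarrow e_1 \hookrightarrow \#z \Rightarrow e_2}
$$

$$
\dfrac{}{\lambda(\#z \Rightarrow x).e \hookrightarrow \lambda(\#z \Rightarrow x).e}
\qquad
\dfrac{e_1 \hookrightarrow \#z \quad e_2 \hookrightarrow \#z}{e_1 \mathrel{=\#} e_2 \hookrightarrow \mathit{True}()}
\qquad
\dfrac{e_1 \hookrightarrow \#z_1 \quad e_2 \hookrightarrow \#z_2 \quad z_1 \neq z_2}{e_1 \mathrel{=\#} e_2 \hookrightarrow \mathit{False}()}
$$

$$
\dfrac{e \hookrightarrow \#z}{\mathit{isOVar}\ e \hookrightarrow \mathit{True}()}
\qquad
\dfrac{e \hookrightarrow v \quad v \not\equiv \#z}{\mathit{isOVar}\ e \hookrightarrow \mathit{False}()}
$$

Figure 2: Big-Step Semantics (λD) of Core DALI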


further reduced by λD to a common reduct: a clear
counterexample to confluence.
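
The two reduction orders can be replayed mechanically. The Python sketch
below is an illustration only and is not the formal calculus: it assumes
a small tuple encoding of the terms in the example, models the
unrestricted binder analysis as a substitution that descends under
meta-level abstractions, and models only the "equal variables" case of
the equality rule. The names subst_ovar and reduce_eq are hypothetical.

```python
# Assumed encoding (not DALI syntax):
#   ("Lam", x, body) | ("Eq", a, b) | ("OVar", z) | ("Var", x) | ("True",)

IDENTITY = ("Lam", "u", ("Var", "u"))

def subst_ovar(term, z, repl):
    """Unrestricted substitution for object variable z, even under Lam."""
    tag = term[0]
    if tag == "OVar":
        return repl if term[1] == z else term
    if tag == "Lam":
        return ("Lam", term[1], subst_ovar(term[2], z, repl))
    if tag == "Eq":
        return ("Eq", subst_ovar(term[1], z, repl),
                      subst_ovar(term[2], z, repl))
    return term

def reduce_eq(term):
    """Contract every redex comparing an object variable with itself."""
    tag = term[0]
    if tag == "Eq":
        a, b = reduce_eq(term[1]), reduce_eq(term[2])
        if a[0] == "OVar" and a == b:
            return ("True",)
        return ("Eq", a, b)
    if tag == "Lam":
        return ("Lam", term[1], reduce_eq(term[2]))
    return term

# Body of the binding (#z => \y. #z = #z).
body = ("Lam", "y", ("Eq", ("OVar", "z"), ("OVar", "z")))

# Analyse the binder first, then try the equality rule.
path_one = reduce_eq(subst_ovar(body, "z", IDENTITY))
# Apply the equality rule first, then analyse the binder.
path_two = subst_ovar(reduce_eq(body), "z", IDENTITY)
```

In this model path_one is stuck at an equality test between two
functions, while path_two has already become the constant True under
the abstraction, so the two results cannot be joined.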

Finally, note that the breakdown of confluence here
provides a concrete illustration of one of the wide range of
difficulties that can arise from mixing function spaces with
“syntax”. Other examples, such as the discussion of “covers” in
the context of MetaML implementation [45], require much
more infrastructure to present.

8.2  Soundness 

We  will  consider  two  programs  to  be  equivalent  when  they 
can  be  interchanged  in  any  context  without  affecting  the 
termination  (or  non-termination)  of  the  full  term  in  which 
they  occur.  This  is  known  as  observational  (or  contextual) 
equivalence,  and  is  defined  as  follows: 

Definition 2 (Observational Equivalence). We write
$e_1 \approx e_2$ if and only if

$$
\forall C \in \mathbb{C}.\quad
(\exists v \in V.\ C[e_1] \hookrightarrow v)
\iff
(\exists v \in V.\ C[e_2] \hookrightarrow v)
$$

Our  soundness  result  can  now  be  stated  as  simply: 

Theorem 3 (Soundness).

$$
\forall e_1, e_2 \in E.\quad e_1 \to e_2 \;\Longrightarrow\; e_1 \approx e_2
$$

First, our proof of this theorem is the first operational
account known to us where the soundness of such reductions
for an untyped CBV functional language with datatypes is
established. (Using bisimilarity techniques, Pitts does present
a similar result, but for a typed CBN language supporting
binary sum types [41].)

Second, this soundness result establishes that
extending the lambda calculus plus datatypes with DALI’s
constructs for introducing and analyzing object-level binders
and free variables at runtime does not injure the notion of
observational equivalence in a devastating way. Certainly,
it may very well be that introducing the new constructs
allows us to distinguish between more terms in the language
(as does introducing exceptions, for example), and this is a
question for future work.


8.2.1  Role in Design of DALI

The immediate technical benefit of this result is providing
technical justification for using the reductions as
semantics-preserving optimizations in an implementation. But there
are other benefits that we are interested in from the point
of view of language design:

1.  It provides us with a basic understanding of the notion
of observational equivalence. In particular, in the
case of this language (as is the case for many
deterministic languages), one arrives at a simple equational
theory simply by changing reduction arrows into
“convertibility” equalities.

2.  Taha’s development [45; 47] emphasizes partitioning
expressions into values, workables, and stucks, and
establishing “monotonicity properties” from which, for
example, Wright and Felleisen’s “Uniform Evaluation”
[53] follows. Thus, not only do we provide the basis
for posing the question of “what is a type system for
datatypes with binders”, we already provide some of the
technical properties needed in establishing type safety
for any type system that we may wish to investigate.

3.  Attaining this result involves constructing a number
of variations of the operational semantics, and relating
them formally. This process provides a substantial
amount of cross-checking between various definitions,
and gives a very accurate operational understanding
of the kind of invariants that a type system will be
expected to guarantee.

9.  RELATED  WORK 

DALI is a functional meta-programming language, and as
such is related to many other meta-systems.

Meta-systems built with a functional programming base
include MetaML [47; 51], λ□ [12] and λ○ [11]. These differ
from DALI in that they are homogeneous systems, where
the meta- and object-languages are the same. None of these
systems provides mechanisms for analyzing the structure of
object-programs.


Theorem prover based meta-systems have been constructed
for several kinds of logics. Implementations of classical
logics include the HOL [18] theorem prover, Isabelle [35], and
the Prototype Verification System (PVS) [34].
Implementations of constructive (or intuitionistic) logics include Elf
[37], Coq [10; 3], Nuprl [9] and Lego [42].

Finally, there are logic programming languages with
meta-programming extensions, λ-Prolog [31; 15; 27], and Lλ [24].
These are Prolog-like languages with extensions for
representing and analyzing object-programs whose
representations are based on the λ-calculus.

Of these systems, Isabelle, Elf, λ-Prolog, and Lλ use
some sort of higher-order abstract syntax to represent object
terms. Of these, all but Lλ use higher-order unification to
implement intensional analysis of object terms. Higher-order
unification is in general undecidable, and does not guarantee
a most general unifier.

Lλ implements a subset of λ-Prolog, where intensional
analysis is syntactically restricted to a form which is
decidable using unification on higher-order patterns. It is
this idea, transferred to the functional programming world,
that is the basis for MLλ and DALI.

The term higher-order abstract syntax was originated by
Pfenning and Elliott [38]. This work provided a basis for
automating reasoning in LF [19], and was used as the basis for
the implementation of Pfenning’s Elf [37] and its successor
Twelf [40].

9.1  DALI vs. MLλ

Dale Miller [23] describes MLλ, a proposal to extend ML
to handle bound variables in data-types. The idea of
representing object-level bindings, in a functional language, using
a binding construct different from the function abstraction
of the meta-language derives from this paper. While our
work takes Miller’s proposed extensions as its basis, there
are some differences:

•  We distill the main ideas of Miller’s MLλ into a basic
calculus of Core DALI. We concentrate on the
reduction semantics and equational theories of such
a language. To the authors’ knowledge, this work is
the first instance of a sound reduction semantics for a
functional language supporting binding constructs in
data-types.

•  We abandon the notion of function extension that
allows extending the domain of arbitrary ML functions
within the scope of an object-level variable. We find
function extension needlessly difficult to model in a
reduction system, and seek to introduce an alternative
construct: patterns that match object-level variables.
We conjecture that, together with equality over
object-level variables, one can circumvent function extension
without loss of expressiveness or good programming
style.

•  We abandon the notion of object-level application [23].
Rather, pattern matching on object-level bindings binds
higher-order pattern variables to functions that perform
appropriate substitutions directly, thus further
simplifying formal development, and, in practical terms,
internalizing object-level variable substitution, which
in [23] must be defined separately for each data-type.

However, internalization of such object-level substitution
in the presence of extensional function values is not
without cost: we had to resort to a fine distinction between
ground (or equality) values and the more standard
notion of values in such calculi, and adjust evaluation
and reduction to restrict the analysis of object-language
terms to preserve soundness and confluence
of the calculus.

DALI differs from most of the other work discussed above
in the following ways:

•  It is functional and deterministic, and is presented as
an extension of a standard CBV functional language.
It provides support for higher-order syntax by providing
a small number of new language constructs.

•  The formal properties we have proven about the
language suggest that the new features integrate well with
the host functional language.

•  The reduction semantics we provide gives rise to a
simple equational theory that can be used to reason about
program equivalence.

10.  CONCLUSIONS  AND  FUTURE  WORK 

In this paper we have shown that a functional programming
language with support for higher-order abstract syntax
through an additional object-level binding construct can
be assigned a simple big-step semantics. We have defined
a reduction semantics and presented important results of
confluence and soundness w.r.t. evaluation of this reduction
semantics for DALI. After this initial success much work
remains to be done. In particular:

•  Developing a basic type system for DALI. In addition
to the traditional notions of safety there are some
efficiency concerns that we expect that a type system
could be used to alleviate. In particular, the discharge
operation and the use of the ground-value restriction
b in the semantics would incur significant run-time
penalties in an implementation. We expect that an
appropriate type system could help avoid these.

•  Integrating with multi-stage programming. In particular,
DALI’s meta-programming utility is orthogonal to
that of multi-stage programming [49; 48; 48; 29; 4]:
with DALI, the object language is allowed to vary, and
intensional analysis is supported. Note, however, that
DALI does support the hygienic synthesis of object
code, although in a manner less concise than those of
multi-stage programming languages. Finally, whereas
it has been demonstrated that multi-stage languages can
guarantee that the synthesized code is type correct, the
only guarantee that we have at the moment with DALI is
that the synthesized code is syntactically correct.

•  An implementation of a full programming language
environment based on DALI. Although a full implementation
of DALI is missing at the moment, the mechanisms
of higher-order pattern matching and analysis
of object-level bindings have been implemented by Tim
Sheard as an experimental feature of the MetaML
interpreter [43].

From the point of view of semantic language design, in
reproducing Taha’s technical development of MetaML, we
have found that all the proofs could be carried out in a
systematic manner for the (considerably larger) language
at hand, and that many of the proofs remain literally
unchanged. This seems to be primarily due to the use of the
notion of “workables” in parameterizing the various
lemmata. In future work, we intend to investigate the extent to
which this development can be generalized.